Temporal abstraction in reinforcement learning is the ability of an agent to learn and use high-level behaviors, called options. The option-critic architecture provides a gradient-based, end-to-end learning method for constructing options. We propose an attention-based extension to this framework, which enables the agent to learn to focus different options on different aspects of the observation space. We show that this leads to behaviorally diverse options, which also enables state abstraction to emerge, and prevents the degenerate problems of option domination and frequent option switching that occur in option-critic, while achieving similar sample complexity. We also demonstrate the more effective, interpretable, and reusable nature of the learned options through different transfer learning tasks. Experimental results in a relatively simple four-rooms environment and the more complex ALE (Arcade Learning Environment) showcase the efficacy of our approach.
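The core mechanism described above, letting each option attend to a different slice of the observation space, can be illustrated with a toy sketch. This is not the paper's implementation; the masking scheme, variable names (`attended_observation`, `option_logits`), and the fixed logits are all illustrative assumptions:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def attended_observation(obs, option_logits):
    """Apply an option-specific soft attention mask to an observation.

    Each option owns a vector of (learnable) logits over observation
    features; its softmax weights which features the option 'looks at'.
    """
    mask = softmax(option_logits)   # non-negative, sums to 1 over features
    return obs * mask               # element-wise reweighting of features

rng = np.random.default_rng(0)
obs = rng.normal(size=8)            # toy 8-dimensional observation

# Two options with complementary (hand-picked, illustrative) attention:
logits_a = np.array([4., 4., 4., 4., -4., -4., -4., -4.])  # attends to first half
logits_b = -logits_a                                       # attends to second half

obs_a = attended_observation(obs, logits_a)
obs_b = attended_observation(obs, logits_b)
```

Because the two masks emphasize disjoint feature subsets, the options effectively see different state abstractions of the same observation, which is the intuition behind the behavioral diversity claimed above.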
translated by 谷歌翻译
We present a novel 8-bit quantization-aware training (S8BQAT) scheme for 8-bit neural network accelerators. Our method is inspired by Lloyd-Max compression theory, with practical adaptations to keep the computational overhead feasible during training. With quantization centroids derived from a 32-bit baseline, we augment the training loss with a Multi-Regional Absolute Cosine (MRACos) regularizer that aggregates weights towards their nearest centroid, effectively acting as a pseudo compressor. Additionally, a periodically invoked hard compressor is introduced to improve the convergence rate by emulating runtime model weight quantization. We apply S8BQAT to speech recognition tasks using the Recurrent Neural Network-Transducer (RNN-T) architecture. With S8BQAT, we are able to increase the model parameter size to reduce the word error rate by 4-16% relative, while still improving latency by 5%.
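The two ingredients named above, a soft regularizer that pulls weights toward their nearest quantization centroid and a periodic hard compressor that snaps them onto the codebook, can be sketched as follows. This is a simplified stand-in, not the paper's MRACos formulation: the penalty here is a plain nearest-centroid absolute distance, and the uniform 8-bit codebook is an assumption (the paper derives centroids from a 32-bit baseline):

```python
import numpy as np

def nearest_centroid(w, centroids):
    """Index of each weight's nearest quantization centroid."""
    d = np.abs(w[:, None] - centroids[None, :])   # |w_i - c_j| distance matrix
    return d.argmin(axis=1)

def pull_regularizer(w, centroids):
    """Soft 'pseudo compressor': mean distance of each weight to its
    nearest centroid; zero exactly when every weight sits on a centroid."""
    idx = nearest_centroid(w, centroids)
    return float(np.abs(w - centroids[idx]).mean())

def hard_compress(w, centroids):
    """Hard compressor: snap every weight onto its nearest centroid,
    emulating runtime model weight quantization."""
    return centroids[nearest_centroid(w, centroids)]

centroids = np.linspace(-1.0, 1.0, 2**8)   # toy uniform 8-bit codebook
w = np.array([0.031, -0.502, 0.874])       # toy float weights
w_q = hard_compress(w, centroids)          # weights after a hard-compress step
```

Adding `pull_regularizer` to the training loss nudges weights toward representable values between the periodic `hard_compress` calls, which is the division of labor the abstract describes.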
Open Arms is a novel open-source platform of realistic human-like robotic hands and arms hardware with 28 degrees of freedom (DoF), designed to extend the capabilities and accessibility of humanoid robotic grasping and manipulation. The Open Arms framework includes an open SDK and development environment, simulation tools, and application development tools for building and operating Open Arms. This paper describes the hands' control, sensing, mechanisms, aesthetic design, and manufacturing, as well as their real-world application with a teleoperated hand-care robot. From 2015 to 2022, the authors designed and established the manufacturing of Open Arms as a low-cost, high-functionality robotic arms hardware and software framework to serve humanoid robot applications and the emerging need for low-cost prosthetics, as part of the Hanson Robotics Sophia Robot platform. Using consumer-product manufacturing techniques, we set out to define modular, low-cost technologies that approximate the dexterity and sensitivity of human hands. To demonstrate the dexterity and control of our hands, we present a Generative Grasping Residual CNN (GGR-CNN) model that can generate robust antipodal grasps from input images of various objects at real-time speed (22ms). We achieved state-of-the-art accuracy of 92.4% with our model architecture on the standard Cornell Grasping Dataset, which contains a diverse set of household objects.
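Accuracy figures on the Cornell Grasping Dataset are conventionally computed with the rectangle metric: a predicted grasp counts as correct when its rectangle overlaps a ground-truth rectangle with IoU above 25% and its orientation differs by less than 30 degrees. A minimal sketch of that metric follows; note it uses axis-aligned boxes for brevity, whereas the full protocol intersects rotated rectangles, and all function names are illustrative:

```python
def rect_iou(a, b):
    """Intersection-over-union of axis-aligned boxes (x1, y1, x2, y2).
    Simplification: the Cornell protocol uses rotated grasp rectangles."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def grasp_correct(pred_box, pred_angle, gt_box, gt_angle,
                  iou_thresh=0.25, angle_thresh=30.0):
    """Rectangle metric: correct if IoU > 25% and the orientation
    difference (angles are modulo 180°) is under 30 degrees."""
    d = abs(pred_angle - gt_angle) % 180.0
    angle_diff = min(d, 180.0 - d)
    return rect_iou(pred_box, gt_box) > iou_thresh and angle_diff < angle_thresh
```

A prediction is scored against every annotated ground-truth grasp for the object, and counts toward accuracy if it matches at least one of them.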
Context-aware decision support in the operating room can foster surgical safety and efficiency by leveraging real-time feedback from surgical workflow analysis. Most existing works recognize surgical activities at a coarse-grained level, such as phases, steps or events, leaving out fine-grained interaction details about the surgical activity; yet those are needed for more helpful AI assistance in the operating room. Recognizing surgical actions as triplets of <instrument, verb, target> combinations delivers comprehensive details about the activities taking place in surgical videos. This paper presents CholecTriplet2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos. The challenge granted private access to the large-scale CholecT50 dataset, which is annotated with action triplet information. In this paper, we present the challenge setup and an assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge. A total of 4 baseline methods from the challenge organizers and 19 new deep learning algorithms from competing teams are presented, recognizing surgical action triplets directly from surgical videos and achieving mean average precision (mAP) ranging from 4.2% to 38.1%. This study also analyzes the significance of the results obtained by the presented approaches, performs a thorough methodological comparison between them and an in-depth analysis of their results, and proposes a novel ensemble method for enhanced recognition. Our analysis shows that surgical workflow analysis is not yet solved, and also highlights interesting directions for future research on fine-grained surgical activity recognition, which is of utmost importance for the development of AI in surgery.
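The mAP figures quoted above follow the usual recipe: compute average precision (AP) per triplet class from ranked detection scores, then average over classes. A minimal sketch, assuming the convention where AP is the mean of the precision values at each positive (the challenge's exact evaluation protocol may differ in detail):

```python
import numpy as np

def average_precision(scores, labels):
    """AP for one <instrument, verb, target> class: rank predictions by
    score, then average the precision measured at each true positive."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    labels = np.asarray(labels, dtype=float)[order]
    if labels.sum() == 0:
        return 0.0
    tp = np.cumsum(labels)                          # true positives so far
    precision = tp / np.arange(1, len(labels) + 1)  # precision at each rank
    return float((precision * labels).sum() / labels.sum())

def mean_average_precision(per_class):
    """mAP: AP averaged over all triplet classes.
    `per_class` is a list of (scores, binary_labels) pairs."""
    return float(np.mean([average_precision(s, y) for s, y in per_class]))
```

For instance, a class ranked perfectly scores AP = 1.0, while one whose only positive lands at rank 3 of 3 scores 1/3, so the mAP over the two classes would be 2/3.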